
    Convergence of Tomlin's HOTS algorithm

    The HOTS algorithm uses the hyperlink structure of the web to compute a vector of scores with which one can rank web pages. The HOTS vector is the vector of the exponentials of the dual variables of an optimal flow problem (the "temperature" of each page). The flow represents an optimal distribution of web surfers on the web graph in the sense of entropy maximization. In this paper, we prove the convergence of Tomlin's HOTS algorithm. We first study a simplified version of the algorithm, which is a fixed point scaling algorithm designed to solve the matrix balancing problem for nonnegative irreducible matrices. The proof of convergence is general (nonlinear Perron-Frobenius theory) and applies to a family of deformations of HOTS. Then, we address the effective HOTS algorithm, designed by Tomlin for the ranking of web pages. The model is a network entropy maximization problem generalizing matrix balancing. We show that, under mild assumptions, the HOTS algorithm converges with a linear convergence rate. The proof relies on a uniqueness property of the fixed point and on the existence of a Lyapunov function. We also show that the coordinate descent algorithm can be used to find the ideal and effective HOTS vectors and we compare HOTS and coordinate descent on fragments of the web graph. Our numerical experiments suggest that the convergence rate of the HOTS algorithm may deteriorate when the size of the input increases. We thus give a normalized version of HOTS with an experimentally better convergence rate.
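
    For illustration, the simplified fixed-point scaling scheme for matrix balancing mentioned in the abstract can be sketched as follows. This is a minimal NumPy sketch of a generic balancing iteration with an illustrative stopping rule, assuming a nonnegative irreducible matrix; it is not the paper's exact algorithm nor its network-entropy generalization.

        import numpy as np

        def balance(A, max_iter=1000, tol=1e-10):
            # Fixed-point diagonal scaling for matrix balancing: find x > 0 such that
            # diag(x) @ A @ diag(1/x) has equal row and column sums (A nonnegative,
            # irreducible).
            n = A.shape[0]
            x = np.ones(n)
            for _ in range(max_iter):
                col = A.T @ x        # (A^T x)_i = x_i * (i-th column sum of the scaled matrix)
                row = A @ (1.0 / x)  # (A (1/x))_i = (i-th row sum of the scaled matrix) / x_i
                x_new = np.sqrt(col / row)  # row sum = column sum  <=>  x_i = sqrt(col_i / row_i)
                if np.max(np.abs(np.log(x_new) - np.log(x))) < tol:
                    return x_new
                x = x_new
            return x

    After convergence, np.diag(x) @ A @ np.diag(1.0 / x) has (approximately) matching row and column sums.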

    Parallel coordinate descent for the Adaboost problem

    We design a randomised parallel version of Adaboost based on previous studies on parallel coordinate descent. The algorithm uses the fact that the logarithm of the exponential loss has a coordinate-wise Lipschitz continuous gradient in order to define the step lengths. We provide the proof of convergence for this randomised Adaboost algorithm and a theoretical parallelisation speedup factor. Finally, we provide numerical examples on learning problems of various sizes showing that the algorithm is competitive with competing approaches, especially for large-scale problems.
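
    As a rough illustration of the ingredients described above (coordinate-wise Lipschitz steps on the logarithm of the exponential loss, coordinates drawn at random), here is a minimal sketch. The margin matrix M, the batch size, and the conservative step length 1/(batch * L_j) are illustrative assumptions, and the batch update is applied in a serial loop rather than truly in parallel.

        import numpy as np

        def randomised_cd_adaboost(M, n_updates=5000, batch=8, seed=0):
            # Randomised coordinate descent on f(a) = log(sum_i exp(-(M a)_i)),
            # the logarithm of the exponential (AdaBoost) loss.
            # M[i, j] = y_i * h_j(x_i) is assumed to lie in [-1, 1].
            rng = np.random.default_rng(seed)
            n, d = M.shape
            a = np.zeros(d)
            L = np.maximum((M ** 2).max(axis=0), 1e-12)  # coordinate-wise Lipschitz bounds
            margins = M @ a
            for _ in range(n_updates):
                w = np.exp(-margins)
                w /= w.sum()                      # AdaBoost-style sample weights
                J = rng.choice(d, size=min(batch, d), replace=False)
                grad_J = -(w @ M[:, J])           # partial gradients of f along the drawn coordinates
                delta = -grad_J / (batch * L[J])  # conservative steps for simultaneous updates
                a[J] += delta
                margins += M[:, J] @ delta        # keep the margins (residuals) up to date
            return a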

    PageRank optimization applied to spam detection

    We give a new link spam detection and PageRank demotion algorithm called MaxRank. Like TrustRank and AntiTrustRank, it starts with a seed of hand-picked trusted and spam pages. We define the MaxRank of a page as the frequency of visits to this page by a random surfer minimizing an average cost per time unit. On a given page, the random surfer selects a set of hyperlinks and clicks with uniform probability on any of these hyperlinks. The cost function penalizes spam pages and hyperlink removals. The goal is to determine a hyperlink deletion policy that minimizes this score. The MaxRank is interpreted as a modified PageRank vector, used to sort web pages instead of the usual PageRank vector. The bias vector of this ergodic control problem, which is unique up to an additive constant, is a measure of the "spamicity" of each page and is used to detect spam pages. We give a scalable algorithm for the computation of MaxRank that allowed us to run experiments on the WEBSPAM-UK2007 dataset. We show that our algorithm outperforms both TrustRank and AntiTrustRank for spam and nonspam page detection.
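
    The underlying computation is an average-cost (ergodic) control problem whose bias vector is unique up to an additive constant. A generic relative value iteration for such a problem is sketched below, as a textbook scheme on an explicit, small action set per state; it is purely illustrative and is not the scalable MaxRank solver of the paper. In the MaxRank setting, the states would be web pages, the actions the admissible hyperlink sets, and the costs would penalize spam pages and link removals.

        import numpy as np

        def relative_value_iteration(P, c, max_iter=1000, tol=1e-9):
            # Average-cost dynamic programming: P[s] is a list of transition rows (one
            # per admissible action at state s) and c[s] the list of matching one-step
            # costs.  Returns an estimate of the optimal average cost and a bias
            # vector, which is only defined up to an additive constant.
            n = len(P)
            h = np.zeros(n)
            rho = 0.0
            for _ in range(max_iter):
                Th = np.array([min(ci + pi @ h for pi, ci in zip(P[s], c[s]))
                               for s in range(n)])
                rho = Th[0]          # use state 0 as the reference state
                h_new = Th - rho     # normalise so that the bias of state 0 is zero
                if np.max(np.abs(h_new - h)) < tol:
                    return rho, h_new
                h = h_new
            return rho, h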

    A generic coordinate descent solver for nonsmooth convex optimization

    We present a generic coordinate descent solver for the minimization of a nonsmooth convex objective with structure. The method can deal in particular with problems with linear constraints. The implementation makes use of efficient residual updates and automatically determines which dual variables should be duplicated. A list of basic functional atoms is pre-compiled for efficiency, and a modelling language in Python allows the user to combine them at run time. The algorithm can thus be used to solve a large variety of problems, including the Lasso, sparse multinomial logistic regression, and linear and quadratic programs.
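
    A minimal sketch of the kind of coordinate update with incremental residual maintenance that such a solver relies on, written here for the Lasso; the function name and stopping rule are illustrative and this is not the solver's own code.

        import numpy as np

        def lasso_cd(X, y, lam, n_epochs=100, tol=1e-8):
            # Cyclic coordinate descent for  min_w 0.5 * ||X w - y||^2 + lam * ||w||_1,
            # keeping the residual r = X w - y up to date so that each coordinate
            # update costs O(n) instead of a full matrix-vector product.
            n, d = X.shape
            w = np.zeros(d)
            r = -y.astype(float)                 # residual X w - y for w = 0
            col_sq = (X ** 2).sum(axis=0)        # per-coordinate curvature ||X_j||^2
            for _ in range(n_epochs):
                max_step = 0.0
                for j in range(d):
                    if col_sq[j] == 0.0:
                        continue
                    g = X[:, j] @ r                                         # partial gradient of the smooth part
                    z = w[j] - g / col_sq[j]
                    w_j = np.sign(z) * max(abs(z) - lam / col_sq[j], 0.0)   # soft-thresholding
                    step = w_j - w[j]
                    if step != 0.0:
                        r += step * X[:, j]                                 # incremental residual update
                        w[j] = w_j
                        max_step = max(max_step, abs(step))
                if max_step < tol:
                    break
            return w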

    A Smooth Primal-Dual Optimization Framework for Nonsmooth Composite Convex Minimization

    We propose a new first-order primal-dual optimization framework for a convex optimization template with broad applications. Our optimization algorithms feature optimal convergence guarantees under a variety of common structure assumptions on the problem template. Our analysis relies on a novel combination of three classic ideas applied to the primal-dual gap function: smoothing, acceleration, and homotopy. The algorithms due to the new approach achieve the best known convergence rates, in particular when the template consists only of non-smooth functions. We also outline a restart strategy for the acceleration that significantly enhances the practical performance. We demonstrate relations with the augmented Lagrangian method and show how to exploit strongly convex objectives with rigorous convergence rate guarantees. We provide numerical evidence with two examples and illustrate that the new methods can outperform the state-of-the-art, including the Chambolle-Pock and alternating direction method of multipliers algorithms.
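
    To make the three ingredients concrete, here is a toy instance on min_x ||A x - b||_1: the l1 term is replaced by its Moreau envelope (a Huber function) with parameter beta (smoothing), minimised with Nesterov's accelerated gradient method (acceleration), while beta is decreased along the iterations (homotopy). The schedules and step sizes below are illustrative assumptions, not the rates or the primal-dual template derived in the paper.

        import numpy as np

        def smoothed_accelerated_l1_fit(A, b, n_iter=2000, beta0=1.0, beta_min=1e-4):
            # Toy "smoothing + acceleration + homotopy" scheme for min_x ||A x - b||_1.
            # The Huber smoothing of the l1 norm has gradient A^T clip((A z - b)/beta, -1, 1)
            # with Lipschitz constant ||A||^2 / beta.
            L_A = np.linalg.norm(A, 2) ** 2
            x = np.zeros(A.shape[1])
            z = x.copy()
            t = 1.0
            for k in range(n_iter):
                beta = max(beta_min, beta0 / (k + 1))            # homotopy: shrink the smoothing
                grad = A.T @ np.clip((A @ z - b) / beta, -1.0, 1.0)
                x_new = z - (beta / L_A) * grad                  # gradient step of length beta / ||A||^2
                t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
                z = x_new + ((t - 1.0) / t_new) * (x_new - x)    # Nesterov extrapolation
                x, t = x_new, t_new
            return x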

    Smoothing technique for nonsmooth composite minimization with linear operator

    We introduce and analyze an algorithm for the minimization of convex functions that are the sum of differentiable terms and proximable terms composed with linear operators. The method builds upon the recently developed smoothed gap technique. In addition to a precise convergence rate result, valid even in the presence of linear inclusion constraints, this new method allows an explicit treatment of the gradient of differentiable functions and can be enhanced with line search. We also study the consequences of restarting the acceleration of the algorithm at a given frequency. These new features are not classical for primal-dual methods and allow us to solve difficult large-scale convex optimization problems. We numerically illustrate the superior performance of the algorithm against the state of the art on basis pursuit, TV-regularized least squares regression, and L1 regression problems.
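
    The effect of restarting the acceleration at a fixed frequency can be illustrated on a plain accelerated proximal gradient (FISTA) loop for min_x f(x) + g(x), with f L-smooth and g proximable. The periodic restart below mirrors the idea studied in the paper, although the paper applies it to its smoothed primal-dual method; the restart frequency, the function names, and the synthetic Lasso example are illustrative assumptions.

        import numpy as np

        def fista_fixed_restart(grad_f, prox_g, L, x0, n_iter=1000, restart_every=100):
            # Accelerated proximal gradient with the momentum reset every
            # `restart_every` iterations.  prox_g(v, s) must return prox_{s*g}(v).
            x = x0.copy()
            z = x0.copy()
            t = 1.0
            for k in range(n_iter):
                if restart_every and k > 0 and k % restart_every == 0:
                    z, t = x.copy(), 1.0                          # forget the accumulated momentum
                x_new = prox_g(z - grad_f(z) / L, 1.0 / L)
                t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
                z = x_new + ((t - 1.0) / t_new) * (x_new - x)
                x, t = x_new, t_new
            return x

        # Illustrative use on a small synthetic Lasso problem:
        rng = np.random.default_rng(0)
        A = rng.standard_normal((50, 200))
        b = rng.standard_normal(50)
        lam = 0.1
        grad_f = lambda x: A.T @ (A @ x - b)
        prox_g = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - lam * s, 0.0)
        x_hat = fista_fixed_restart(grad_f, prox_g, np.linalg.norm(A, 2) ** 2, np.zeros(200))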